A Taxonomy Framework for Unsupervised Outlier Detection Techniques for Multi-Type Data Sets

Authors

  • Yang Zhang
  • Nirvana Meratnia
  • Paul Havinga
Abstract

The term “outlier” can generally be defined as an observation that differs significantly from the other values in a data set. Outliers may be instances of error or may indicate events. Outlier detection aims to identify such observations in order to improve data analysis and to discover interesting and useful knowledge about unusual events in numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees for selecting the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages, and performance issues of each class of outlier detection techniques under this taxonomy framework.

Similar resources

Combining of Magnitude and Direction of Change Indices to Unsupervised Change Detection in Multitemporal Multispectral Remote Sensing Images

In remote sensing, image-based change detection techniques analyze two images acquired over the same area at different times t1 and t2 to identify the changes that occurred on the Earth's surface. Change detection approaches are mainly categorized as supervised and unsupervised. Generating the change index is a key step for change detection in multi-temporal remote sensing images. Unsupervised chan...

Outlier Detection in WSN- A Survey

In the field of wireless sensor networks, measurements that deviate from the normal behaviour of sensed data are taken to be outliers. The potential sources of outliers can be noise and errors, events, and malicious attacks on the network. This paper gives an overview of existing outlier detection techniques specifically developed for wireless sensor networks. Also, a technique-based ...

A Multi-Stage Intrusion Detection Approach for Network Security

Nowadays, the massive increase in applications running on computers and in network services makes it necessary to take suitable security policies into account. Many intrusion detection methods have been proposed to provide security in computer systems and networks using data mining. These methods comprise outlier, unsupervised, and supervised methods. As we know, each data mining me...

Penalized unsupervised learning with outliers.

We consider the problem of performing unsupervised learning in the presence of outliers - that is, observations that do not come from the same distribution as the rest of the data. It is known that in this setting, standard approaches for unsupervised learning can yield unsatisfactory results. For instance, in the presence of severe outliers, K-means clustering will often assign each outlier to...

Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm

Unsupervised anomaly detection is the process of finding outliers in data sets without prior training. In this paper, a histogram-based outlier detection (HBOS) algorithm is presented, which scores records in linear time. It assumes independence of the features, making it much faster than multivariate approaches at the cost of less precision. A comparative evaluation on three UCI data sets and 10 s...
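
The histogram-based idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes equal-width bins, a hypothetical `n_bins` parameter, and a score equal to the sum over features of the log of the inverse normalized bin height. The function name `hbos_scores` is also made up for this illustration. Each feature is handled independently and each record is visited once per feature, which is where the linear running time comes from.

```python
import math


def hbos_scores(rows, n_bins=10):
    """Return one outlier score per row; a higher score means a more unusual row.

    Illustrative sketch only: equal-width bins, features treated as independent.
    """
    n_features = len(rows[0])
    scores = [0.0] * len(rows)
    for j in range(n_features):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        # Fall back to a width of 1.0 when the feature is constant.
        width = (hi - lo) / n_bins or 1.0
        # Map each value to an equal-width bin; the maximum goes in the last bin.
        bins = [min(int((v - lo) / width), n_bins - 1) for v in column]
        counts = [0] * n_bins
        for b in bins:
            counts[b] += 1
        tallest = max(counts)
        for i, b in enumerate(bins):
            # counts[b] / tallest is the normalized bin height in (0, 1];
            # the log of its inverse is 0 for the densest bin and grows for sparse bins.
            scores[i] += math.log(tallest / counts[b])
    return scores
```

Calling `hbos_scores` on a small table where one row lies far from the rest gives that row the largest score, while rows in the densest bins score zero.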

Publication date: 2007